/*******************************************************************************
Replication Materials for Blau, Kahn, Brummund, Cook, and Larson-Koester "Is 
There Still Son Preference in the United States?" Journal of Population
Economics, forthcoming.

ACS Sample Construction and Variable Definitions

Date Modified: 11/06/2019

*******************************************************************************/


*** Regression File Construction

The ACS regression file is constructed by merging spouse, child, parent, and 
unmarried partner characteristics onto each ACS individual, creating one "wide" 
format dataset. Country characteristic variables from auxiliary datasets are 
then merged in for individuals, spouses, and parents, matched on each person's 
country of birth. We include the year, serial, and pernum variables so that 
users can merge in additional variables.

Only children under age 18 are counted as children. Child characteristic 
variables are numbered (e.g., c_age1, c_age2, etc.), with each number corresponding 
to one child. Children are numbered from oldest to youngest.
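
The child numbering described above can be sketched as follows. This is a 
minimal illustration in Python, not the code used to build the regression file: 
the input record layout, the function name children_to_wide, and grouping by 
(year, serial) alone are assumptions made for the example. Only the under-18 
rule, the oldest-to-youngest ordering, and the c_age*/c_female* naming come from 
this document.

```python
# Hypothetical long-format child records (one row per child).
children = [
    {"year": 2010, "serial": 1, "age": 4, "female": 1},
    {"year": 2010, "serial": 1, "age": 9, "female": 0},
    {"year": 2010, "serial": 1, "age": 19, "female": 1},  # 18 or older, excluded
    {"year": 2010, "serial": 2, "age": 12, "female": 1},
]


def children_to_wide(records):
    """Return {(year, serial): {"c_age1": ..., "c_female1": ..., ...}}."""
    by_household = {}
    for rec in records:
        if rec["age"] < 18:  # only children under 18 count as children
            key = (rec["year"], rec["serial"])
            by_household.setdefault(key, []).append(rec)

    wide = {}
    for key, kids in by_household.items():
        # Child 1 is the oldest, child 2 the next oldest, and so on.
        ordered = sorted(kids, key=lambda k: k["age"], reverse=True)
        row = {}
        for i, kid in enumerate(ordered, start=1):
            row[f"c_age{i}"] = kid["age"]
            row[f"c_female{i}"] = kid["female"]
        wide[key] = row
    return wide


wide = children_to_wide(children)
```

In the actual file the child variables are attached to each individual rather 
than stored per household; the (year, serial) key is used here only to keep the 
sketch short.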


*** Regression File Sample Restrictions

The ACS regression files drop observations with a same-sex spouse or unmarried 
partner, drop individuals younger than 18 or older than 64, drop individuals 
living in group quarters, and drop individuals with two or more children born in 
the same year and quarter. Other sample restrictions are made in the table code.
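
For readers who want to see the four restrictions in one place, here is a 
minimal Python sketch. The field names (same_sex_couple, group_quarters, 
child_births) and the function keep_in_sample are hypothetical and do not 
correspond to variables in the regression file; only the rules themselves are 
taken from the paragraph above.

```python
def keep_in_sample(person):
    """Return True if a person record passes the four restrictions above."""
    if person["same_sex_couple"]:  # same-sex spouse or unmarried partner
        return False
    if person["age"] < 18 or person["age"] > 64:
        return False
    if person["group_quarters"]:
        return False
    # child_births is a list of (birth_year, birth_quarter) pairs; a repeated
    # pair means two or more children born in the same year and quarter.
    births = person["child_births"]
    if len(births) != len(set(births)):
        return False
    return True


# Hypothetical records for illustration only.
people = [
    {"age": 35, "same_sex_couple": False, "group_quarters": False,
     "child_births": [(2005, 1), (2008, 3)]},
    {"age": 17, "same_sex_couple": False, "group_quarters": False,
     "child_births": []},
    {"age": 40, "same_sex_couple": False, "group_quarters": False,
     "child_births": [(2006, 2), (2006, 2)]},
    {"age": 64, "same_sex_couple": False, "group_quarters": False,
     "child_births": []},
]

sample = [p for p in people if keep_in_sample(p)]
```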


*** Main Sample Defining Variables

citizen : Categorical variable for born abroad to American parents (1), naturalized
	citizen (2), or not a citizen (3).
marst : Marital status of the mother.
native : Indicator for native born.
bothfirmarr : Indicator for mother and father being in the first marriage.
nchild18 : Number of children less than 18 in the household.
oldc : Age of the oldest child in the household.
nonrkids : Indicator for a household with step or adopted children.
sfrelate : Indicator for subfamilies.
father_sample : Indicator for fathers in the headship sample.
female : Indicator for female.
foster_hh : Indicator that household contains foster children.
multi_sample : Indicator for the household containing twins, triplets, etc.
gt1chld : Indicator for having at least two children.
gt2chld : Indicator for having at least three children.
nonusbirth : Indicator for a household with a child born outside the US.
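
As an illustration of how the child-count variables relate to one another, the 
sketch below derives nchild18, gt1chld, and gt2chld from a list of child ages. 
It assumes that the two indicators are based on the count of children under 18; 
the definitions above do not say which count is used, so treat that as an 
assumption. The function name child_count_indicators is hypothetical.

```python
def child_count_indicators(child_ages):
    """Derive the child-count variables from the ages of children in a household."""
    n_under18 = sum(1 for age in child_ages if age < 18)
    return {
        "nchild18": n_under18,
        "gt1chld": int(n_under18 >= 2),  # at least two children
        "gt2chld": int(n_under18 >= 3),  # at least three children
    }


# Hypothetical household with children aged 3, 7, and 20.
indicators = child_count_indicators([3, 7, 20])
```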

*** Other variables

hhwtnorm : Household weight variable normalized so that each sample year receives
	equal weight.
perwtnorm : Person weight variable normalized so that each sample year receives
	equal weight.
chld1 : Indicator for first born child being a girl.
chld2 : Indicator for second born child being a girl.
femhdalt : Indicator for unmarried female with at least one child.
lths (sp_lths) : Indicator for (spouse) education less than high school.
scol (sp_scol) : Indicator for (spouse) education of some college.
cold (sp_cold) : Indicator for (spouse) education of a college degree or more.
genrace (sp_genrace) : Categorical variable for (spouse) race, including White
	(non-Hispanic) (1), Black (non-Hispanic) (2), Hispanic (3), Asian (non-Hispanic)
	(4), and Other (non-Hispanic).
region : Categorical variable for Census region and division (9 categories).
year : Survey year.
age (sp_age) : (Spouse) age. age2 and age3 refer to age squared and age cubed.
nchild : Number of children.
bpld : Mother's birth country.
yrsusa1 (sp_yrsusa1) : (Spouse) years in the USA. yrsusa2 refers to this variable squared.
sp_imm : Indicator for whether the spouse is an immigrant.
marrno : Number of times mother married.
raced : Detailed race information, see IPUMS USA for further information.
empstat : Employment status.
spacing : The age of the second oldest child minus the age of the oldest child.
c_female* : Sex of the 1st, 2nd, 3rd and 4th child.
serial : Unique household identifier for each year.
pernum : Unique person identifier for each year when combined with serial.
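
The normalized weights (hhwtnorm, perwtnorm) give each sample year equal weight. 
One way to achieve this, shown as a sketch only, is to divide each weight by the 
sum of weights in its survey year so that every year's weights sum to one. The 
exact scaling used in the regression file is not stated here, so the choice of 
summing to one is an assumption, as are the function name normalize_by_year and 
the use of perwt as the input weight.

```python
from collections import defaultdict


def normalize_by_year(rows, weight_key, out_key):
    """Rescale weights so that each survey year's weights sum to one."""
    totals = defaultdict(float)
    for row in rows:
        totals[row["year"]] += row[weight_key]
    for row in rows:
        row[out_key] = row[weight_key] / totals[row["year"]]
    return rows


# Hypothetical person records for illustration only.
rows = [
    {"year": 2008, "perwt": 100.0},
    {"year": 2008, "perwt": 300.0},
    {"year": 2009, "perwt": 50.0},
]
normalize_by_year(rows, "perwt", "perwtnorm")
```

After this step the 2008 and 2009 records each contribute a total weight of one, 
regardless of how many observations or how much raw weight each year contains.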

*** Parent Source Country Variables (See Data Appendix for Sources)

m_igdp00_07 : Mother source country GDP average (2000-2007)
m_isexratio00_07 : Mother source country sex ratio at birth average (2000-2007)
m_ilfp00_07 : Mother source country labor force participation average (2000-2007)
m_ifert00_07 : Mother source country fertility average (2000-2007)
m_iscore00_07 : Mother source country Gender Gap Index (2000-2007)

f_igdp00_07 : Father source country GDP average (2000-2007)
f_isexratio00_07 : Father source country sex ratio at birth average (2000-2007)
f_ilfp00_07 : Father source country labor force participation average (2000-2007)
f_ifert00_07 : Father source country fertility average (2000-2007)
f_iscore00_07 : Father source country Gender Gap Index (2000-2007)
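
The m_ and f_ prefixes indicate that the same source country measure is attached 
once for the mother and once for the father, based on each parent's country of 
birth. A minimal Python sketch of that lookup follows. The birthplace codes, the 
numeric values, the field names mother_bpl and father_bpl, and the function 
attach_source_country are all placeholders invented for the example; see the 
Data Appendix for the actual sources.

```python
# Placeholder lookup table: codes and values are made up for illustration.
country_chars = {
    "A": {"igdp00_07": 1.0, "ifert00_07": 2.0},
    "B": {"igdp00_07": 3.0, "ifert00_07": 4.0},
}


def attach_source_country(row, birthplace_key, prefix, lookup):
    """Copy source country variables onto row, adding the given prefix."""
    chars = lookup.get(row.get(birthplace_key))
    if chars is None:
        return row  # no matching source country; variables stay absent
    for name, value in chars.items():
        row[prefix + name] = value
    return row


# Hypothetical record: mother born in country "A", father with no match.
row = {"mother_bpl": "A", "father_bpl": "US"}
attach_source_country(row, "mother_bpl", "m_", country_chars)
attach_source_country(row, "father_bpl", "f_", country_chars)
```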







